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An  examination  of  software  system  safety  analysis  has 
been  made  and  generalized  techniques  examined.  These  tech¬ 
niques  parallel  the  techniques  used  for  hardware  analysis 
and  are,  in  fact,  predicted  on  the  fact  that  the  only  safety 
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hardware  component.  Discussion  is  presented  for  a  top  to 
bottom  and  a  bottom  up  hierarchical  analysis,  as  well  as  an 
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I.  BACKGROUND 


Although  the  safety  of  the  product  has  always  received 
some  consideration,  albeit  possibly  tacit,  the  formal, 
systematic  safety  programs  as  we  know  them  today  did  not  come 
into  being  until  the  early  1960's.  The  one  exception  to  this 
was  the  very  strict  safety  controls  established  by  the  Atomic 
Energy  Commission  on  the  use  and  exposure  to  nuclear  materials. 

The  first  system  safety  documentation  requirement  was  the 
United  States  Air  Force  Ballistic  Systems  Division  Exhibit 
62-41,  "System  Safety  Engineering  for  the  Development  of  Air 
Force  Ballistic  Missiles",  published  in  April  1962.  This 
document  established  System  Safety  requirements  for  the  Asso¬ 
ciate  Contractors  on  the  Minuteman  missile  program. 

In  September  1963,  the  United  States  Air  Force  Specifica¬ 
tion,  MIL-S- 38130  (USAF) ,  "General  Requirements  for  Safety 
Engineering  of  Systems  and  Associated  Subsystems  and  Equipment" 
was  promulgated  as  the  first  military  requirement  for  the 
engineering  safety  of  general  systems.  This  specification  was 
closely  followed  in  October  1963  by  the  Navy's  similar  (and 
nearly  duplicate)  requirement,  MIL-S-38130 (WEPS) .  These  two 
documents  were  later  merged  into  a  joint  specification, 
MIL-S-38130 (ASG) ,  and  in  June  1966,  this  specification  became 
a  Department  of  Defense  (DoD)  requirement,  MIL-S- 38130A. 

In  July  1969,  the  System  Safety  Specification  was  revised 
into  a  Military  Standard,  MIL-STD-882  and  in  July  1977, 


1 


T 


MIL-STD-882A  expanded  and  defined  in  more  specific  terms  the 
System  Safety  Program  requirements. 

In  December  1978,  DoD  Instruction  5000.36,  "System 
Safety  Engineering  and  Management"  stated  that:  "The  Heads 
of  DoD  components  shall  establish  system  safety  programs  and 
apply  Military  Standard  882A. ...for  each  major  system  acqui¬ 
sition  of  other  systems  and  facilities,  as  appropriate,  based 
on  the  severity  of  associated  hazards  and  the  potential  for 
loss  or  damage . . . " . 

The  purpose  of  the  System  Safety  Program,  as  stated  in 
MIL-STD  882A  is  "To  provide  uniform  requirements  for 
developing  and  implementing  a  system  safety  program  of 
sufficient  comp rehens iveness  to  identify  the  hazards  of  a 
system  and  to  ensure  that  adequate  measures  are  taken  to 
eliminate  or  control  the  hazard". 

The  general  requirements  of  the  System  Safety  program 
are  that  safety,  consistent  with  mission  requirements,  is 
built  into  the  system  in  a  timely,  cost  effective  manner; 
that  hazards  associated  with  each  system  are  identified, 
evaluated,  and  eliminated  or  controlled  to  an  acceptable 
level  throughout  the  entire  life  cycle  of  the  system;  and 
that  retrofit  actions  required  to  improve  safety  are  mini¬ 
mized  through  the  timely  inclusion  of  safety  features  during 
the  development  and  acquisition  of  a  system. 

Each  of  the  above  listed  requirements  deserves  special 
attention  and  comment.  For  example,  it  is  to  be  noted  that 
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the  Military  Standard  calls  for  safety  consistent  with 
mission  requirement,  not  "safety  at  any  cost".  The  util¬ 
ity  of  the  system  in  the  expected  use  environment  is  still 
the  prime  factor  for  consideration,  and  the  design  safety 
is  intended,  not  to  inhibit  the  mission,  but  rather  to  en¬ 
hance  the  accomplishment  of  the  mission. 

It  is  also  stated  that  hazards  are  to  be  identified, 
evaluated  and  eliminated  or  controlled  to  an  acceptable 
level.  This  is  the  essence  of  what  might  be  called  the 
System  Safety  Process.  The  use  of  historical  data,  simu¬ 
lation,  synthesis  and  test  and  evaluation  are  required  to 
identify  hazards  of  the  system  while  the  system  is  still 
in  the  design  process. 

The  goal  of  this  action  is  to  minimize  the  risk,  based 
on  hazard  severity,  hazard  probability,  criticality,  cost, 
time,  resources  and  mission,  throughout  the  lift  cycle 
including  disposition  and  disposal. 

By  inserting  the  safety  process  as  early  as  possible 
in  the  design  (and  even  concept)  process,  costly  retrofit 
actions  are  reduced  at  a  great  savings  of  money,  operational 
usage  and  limited  resources.  This  takes  safety  from  its 
prior  "Band-Aid"  approach  of  "try  it  and  then  fix  it"  into 
a  new  regime  of  designing  the  safety  into  the  original 
product. 


II.  RISK  ASSESSMENT 


The  general  requirements  of  MIL-STD  882A  include  the 
risk  assessment  procedures  of  conducting  hazard  analyses 
starting  in  the  Conceptual  Phase  and  proceeding  through  the 
production  phase.  These  analyses  include: 

1.  Preliminary  Hazard  Analysis  -  a  "broad  brush" 
look  at  the  potential  hazards  of  systems  and  subsystems. 

The  Preliminary  Hazard  Analysis  (PHA)  is  begun  long  before 
detailed  designs  begin  to  take  shape,  and,  although  quite 
qualitative  in  nature,  the  PHA  provides  a  base  for  other 
types  of  analyses . 

2.  Subsystem  Hazard  Analysis  -  Once  detailed  designs 
are  underway,  more  in-depth  analyses  can  be  conducted  on 
the  subsystems.  The  subsystem  may  be  a  single  component  or 
part,  or  it  may  be  a  complex  mini-system.  Frequently  the 
Subsystem  Hazard  Analysis  (SSHA)  is  conducted  on  a  fairly 
large  block,  and  only  if  the  analysis  indicates  a  high 
degree  of  criticality  is  the  analysis  extended  to  lower 
level  components.  This  is  necessary  due  to  the  time  and 
cost  involved  in  the  analysis  procedure. 

3.  System  Hazard  Analysis  -  Once  an  SSHA  has  been 
completed,  one  has  some  degree  of  confidence  that  the 
critical  hazards  have  been  identified  and  eliminated  or 
controlled  to  an  acceptable  level  in  each  of  the  applicable 
subsystems.  At  this  point  a  Systems  Hazard  Analysis  (SHA) 
is  conducted  to  investigate  the  safety  of  subsystem 


interfaces.  It  is  not  at  all  unlikely  that  the  interrela¬ 
tionships  between  two  "safe"  subsystems  may  result  in  some 
unsafe  action  of  the  combined  system. 

4.  Operating  and  Support  Analysis  -  Although  it  is 
easy  to  forget  during  the  design/production  process  that 
the  ultimate  purpose  is  not  the  development  of  a  system, 
but  is  rather  the  operational  utilization  of  the  system, 
one  must  keep  in  mind  throughout  the  design/production 
phases  that  this  system  must  be  operated  and  supported  to 
meet  its  mission  utility.  This  is  the  purpose  of  Operating 
and  Support  Hazard  Analysis  (O&SHA)  where,  for  the  first 
time,  the  human  element  is  injected  into  the  equation.  The 
O&SHA  considers  operation,  support,  maintenance,  transporta¬ 
tion  and  other  operational  uses  of  the  system. 

The  purpose  of  all  these  analyses  is  to  (a)  identify 
any  potential  hazards,  (b)  evaluate  the  hazards,  (c)  assess 
the  risk  of  the  hazard,  and  (d)  provide  the  necessary  infor¬ 
mation  required  for  the  elimination  or  control  of  the  hazard 
if  the  hazard  is  determined  to  be  critical. 
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III.  HAZARD  ANALYSIS  TECHNIQUES 

Although  the  techniques  to  be  used  in  conducting  hazard 
analyses  are  generally  at  the  option  of  the  contractor, 
several  analysis  techniques  that  may  be  used  are  spelled  out 
in  MIL-STD  882A.  These  are: 

1.  Fault  Tree  Analysis  -  The  Fault  Tree  is  based  on 
the  Logic  Tree  procedure  as  developed  by  the  Bell  Laboratories. 
It  was  originally  modified  by  the  Boeing  Company  to  trace 
fault  developments  in  the  Minuteman  missile  system.  The 
Tree  uses  logic  "gates"  to  build  downward  from  a  Head  or 
Undesired  Event.  Inasmuch  as  a  form  of  the  Fault  Tree 
Analysis  may  be  used  in  Software  analysis,  some  detail  of 
this  technique  is  in  order.  Consider  a  system  whose  wiring 
diagram1  is  shown  below: 


"Advanced  Concepts  in  Fault  Tree  Analysis"  by  David  Haasl. 
Paper  presented  at  System  Safety  Symposium,  Seattle,  WA  (1965) . 
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When  the  Switch  is  closed,  the  Timer  Coil  is  energized 
closing  the  Timer  Contacts  which  puts  power  on  the  Relay 
Coil.  With  the  Relay  Coil  energized,  the  relay  contacts 
close,  permitting  current  flow  to  the  motor.  It  has  been 
determined  that  overheating  of  the  wire  from  point  A  to 
point  B  is  critical  to  the  system  operation,  and  a  Fault 
Tree  is  constructed  with  the  Head  Event  "Overheated  Wire". 

Examination  of  the  system  shows  that  two  events  may 
produce  an  overheated  wire,  "Excessive  current  in  the  motor 
system  wiring"  and  "Power  applied  to  the  system  for  an  ex¬ 
tended  time" .  It  is  also  seen  that  both  of  these  occurrences 
must  happen  in  concert.  In  other  words,  the  "Overheated 
Wire"  is  a  result  of  "Excessive  Current"  AND  "Power  Applied 
for  an  Extended  Time".  The  logic  event  is,  therefore,  an 
AND  gate. 

To  determine  why  the  power  may  be  applied  for  an  ex¬ 
tended  time,  we  examine  the  circuit  and  note  that  this  may 
occur  if  the  power  is  not  removed  from  the  relay  coil  for 
some  reason,  or  if  the  relay  contacts  fail  in  the  closed 
position.  We  now  have  the  next  branch  of  our  Tree  with  an 
OR  gate  connecting  "Power  not  removed  from  relay  coil"  and 
"Relay  contacts  fail  closed". 

Similar  development  takes  place  proceeding  downward 
from  the  Head  Event  until  one  reaches  primary  or  secondary 
failures.  Primary  failures  are  those  that  occur  while  the 
part  or  component  is  operating  within  the  parameters  for 
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which  it  was  designed,  and  a  Secondary  failure  is  /£hen  the 
component  is  subjected  to  abnormal,  out-of-design  stresses. 

A  complete  Fault  Tree  diagram  for  the  system  tinder 
consideration  is  shown  in  Figure  2 . 

Analysis  of  a  Fault  Tree  may  be  either  qualitative  or 
quantitative.  If  the  failure  probabilities  for  each  of  the 
"end  faults"  of  Figure  2  are  known,  the  failure  probabili¬ 
ties  of  each  of  the  intermediate  branches  as  well  as  the 
failure  probability  of  the  Head  Event  may  be  computed  using 
Boolean  Algebra. 

But  even  a  qualitative  examination  of  many  Fault  Trees 
will  provide  useful  information.  With  an  AND  gate  leading 
directly  to  the  Head  Event,  it  is  seen  that  inasmuch  as 
both  of  the  next  lower  level  events  must  occur  for  the  un¬ 
desired  Head  Event  to  occur,  eliminating  either  of  the  next 
level  events  will  eliminate  the  failure  of  the  Head  Event. 

Since  the  gate  leading  to  the  "Power  applied  for  an  ex¬ 
tended  time"  is  an  OR  gate,  failure  of  either  of  the  next 
lower  level  events  will  still  produce  this  undesired  sub¬ 
event.  This  means  that  both  of  the  next  lower  events  must 
be  eliminated  to  prevent  this  sub-failure. 

On  the  other  hand,  the  gate  leading  to  "Excessive 
current  in  the  system  wiring"  is  also  an  AND  gate,  and  this 
sub-failure  will  only  occur  if  both  of  its  sub-level  events 
occur.  One  of  these  events,  "Motor  failed  shorted"  will 
occur  only  with  a  primary  motor  failure,  leading  us  to  the 
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conclusion  that  a  high  quality,  high  reliability  motor  will 
prevent  (or  at  least  reduce  the  probability  of  the  occurrence 
of)  the  Head  Event. 

We  see  also  that  if  we  can  prevent  the  fuse  from  failing 
to  open,  we  can  also  prevent  excessive  wiring.  This  may  be 
accomplished  by  both  preventing  the  insertion  of  an  over¬ 
sized  fuse  (design  of  the  fuse  holder)  and  prevention  of  a 
primary  fuse  failure  in  which  the  fuse  fails  to  open  with 
excessive  current.  With  most  fuse  designs,  this  type  of 
fuse  failure  should  be  extremely  remote. 

2.  Fault  Hazard  Analysis  -  A  second  technique  cited 
in  MIL-STD  882A  is  the  Fault  Hazard  Analysis  (FHA) ,  which 
is  based  on  the  techniques  long  in  use  by  Reliability  called 
Failure  Mode  and  Effect  Analysis  ( FMEA)  or  Failure  Mode, 
Effect,  and  Criticality  Analysis  ( FMECA) .  The  Fault  Hazard 
Analysis,  unlike  the  Fault  Tree  Analysis  starts  at  the 
bottom  and  works  up,  rather  than  at  the  top  working  down. 
Whereas,  in  the  Fault  Tree  Analysis  we  conducted  our 
analysis  to  determine  what  failures  would  cause  an  undesired 
event,  in  the  Fault  Hazard  Analysis,  one  starts  with  the 
part  or  component  failure  and  determines  the  ultimate  effect 
of  that  failure. 

The  Fault  Hazard  Analysis  may  be  used  in  detailed 
examination,  such  as  in  an  SHA  where  one  is  considering 
failure  modes  and  effects  of  detailed  designs,  or  it  may  be 
used  in  a  less  structured  format  in  an  overall  analysis  of 
design  concepts,  as  in  the  Preliminary  Hazard  Analysis. 
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As  with  the  Fault  Tree  Analysis,  the  Fault  Hazard 
Analysis  may  be  either  qualitative  or  quantitative.  A 
qualitative  analysis  usually  precedes  any  quantification. 

3.  Sneak  Circuit  Analysis  -  A  third  technique 
referenced  in  MIL-STD  882A  is  that  of  the  Sneak  Circuit 
Analysis.  This  is  a  technique,  originally  developed  by  the 
Boeing  Aerospace  Company,  to  locate  hazards  that  might 
occur  without  a  failure  in  the  system.  A  classic  example 
of  a  Sneak  Circuit  was  in  a  1960's  imported  car  in  which 
the  radio  and  the  brake  lights  both  received  their  power 
from  a  common  terminal  on  the  switch.  The  brake  lights  were 
also  capable  of  being  powered  through  the  emergency  flasher 
module,  and  this  portion  of  the  wiring  was  such  that  when 
the  brake  pedal  was  depressed,  even  with  the  ignition  switch 
off,  the  brake  lights  were  illuminated  -  and  the  radio  came 
on  I 

In  addition  to  these  three  techniques  cited  in  MIL-STD 
882A,  there  are  many  others,  lesser  known  analysis  techniques 
These  include  Energy  Transfer  techniques  wherein  all  sub¬ 
systems  and  systems  are  considered  on  the  basis  of  energy  in 
energy  out,  and  Resource  Utilization  techniques  that  consider 
desired  (and  undesired)  outputs  codified  on  the  basis  of  in¬ 
put  resources . 
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IV.  SOFTWARE  ANALYSIS  -  GENERAL 

Software  reliability  predictions  are  available  that 
provide  numerical  predictions  of  how  many  errors  will  remain 
in  the  software  when  it  is  delivered,  but  these  predictions 
do  not  tell  the  effects  of  a  software  error,  where  it  will 
occur,  or  in  what  mission  phase. 

Several  analysis  techniques  have  been  used  to  give 
partial  answers  to  the  software  system  safety  questions, 
but  these  have  been,  in  general,  limited  to  individual 
disciplines  and/or  to  specific  categories  of  failure. 

There  is,  however,  one  factor  that  acts  as  a  catalyst 
for  the  solution  of  the  software  system  safety  problem, 
and  that  is  the  fact  that  any  software  difficulty  is  tran¬ 
slatable  into  a  general  system  safety  problem  only  if  there 
is  a  hardware  involvement.  That  is  to  say,  the  only  soft¬ 
ware  'mishap"  that  has  safety  implication  is  one  that 
commands  a  hardware  function  at  the  wrong  time,  in  the  wrong 
sequence,  in  the  wrong  manner  or  when  the  hardware  component 
should  not  be  commanded  at  all. 

This  fact,  in  itself,  gives  rise  to  possible  software 
system  safety  techniques  in  which  the  software  'mishap'  is 
directly  related  to  hardware  'mishaps'  for  which  there  are 
well  known  and  experienced  analysis  techniques,  as  discussed 
in  Part  III.  It  would  appear,  therefore,  that  an  examina¬ 
tion  of  each  known  hardware  mishap  for  possible  software 
command  functions  would  suffice  as  a  software  system  safety 
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analysis.  And  indeed  it  would,  except  that  this  wholesale 
approach  would  be  extremely  time  consuming  and,  as  a  result, 
extremely  costly.  One  would  also  have  to  exercise  great 
caution  to  ensure  that  software  interfaces  are  considered  in 
addition  to  the  already  considered  hardware  interfaces. 

Although  there  are  many  proponents  of  various  single 
technique  methods  for  software  system  safety  analysis,  even 
these  techniques  are  generally  modified  so  as  to  take 
advantage  of  other  techniques  in  some  degree  to  reduce  the 
cost  and  time  that  would  be  otherwise  involved. 

In  general,  software  analysis  techniques  for  system 
safety  follow  the  hardware  analysis  methods.  One  method 
is  the  'bottom  up'  method  similar  to  the  Failure  Mode  and 
Effect  Analysis  and  another  is  the  'top  down'  procedure 
related  to  the  Fault  Tree  Analysis.  Some  detailed  examina¬ 
tion  of  these  two  methods,  as  well  as  a  combined  method  will 
be  discussed  in  subsequent  parts  of  this  summary. 
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V.  TOP  DOWN  SOFTWARE  ANALYSIS 


The  Top  Down  analysis  technique  is  based  on  the  deter¬ 
mination  of  hazardous  termination  points  for  the  software 
program.  This  is  analogous  to  the  Undesired  Event  which  is 
the  Head  Event  of  the  hardware  Fault  Tree  Analysis. 

The  first  step  in  the  Top  Down  techniques  is  to  define 
Acceptable  Terminations  as  well  as  Hazardous  Terminations. 
Such  a  definition  list,  as  related  to  a  missile  system,  is 
as  follows:^ 

Acceptable  Defined  Terminations 

1.  Missile  successfully  launched. 

2.  Missile  aborted  with  booster  and  warhead  safe. 

3.  Missile  aborted  with  booster  or  warhead  not 
saved  and  with  operator  alerts. 

4.  System  cycled  to  Hold  with  operator  alerts. 

5.  System  cycled  to  Test  Mode  with  operator  alerts. 

6.  System  recycled  to  known  states  with  operator 
alerts. 

7.  System  automatically  cycled  to  power  down  with 
appropriate  operator  alerts. 

Hazardous  Terminations 

1.  Unauthorized  launch  of  a  missile. 

2.  Unintentional  launch  of  a  missile,  including  the 
launch  of  a  wrong  missile. 

Software  Safety  and  Security  Analysis  Techniques  and 
Methods,  Vitro  Laboratories,  Silver  Spring,  MD,  1980. 
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3.  Missile  abort  without  operator  alerts.  The 


alerts  to  include  the  safety  evaluation  of  the 
aborted  missile. 

4.  A  premature  issuance  of  a  unique  pre-arm  signal. 

5.  An  abort  without  adequate  declassification. 

6.  An  unauthorized  disclosure  of  classified  data. 

7.  Any  unauthorized  or  unintentional  modification 
of  safety  and/or  security  sensitive  programs  or 
data. 

8.  Any  termination  which  is  not  a  member  of  the  set 
of  acceptably  defined  terminations. 

It  is  to  be  observed  that  several  of  the  Hazardous 
Termination  categories  go  well  beyond  the  safety  of  the 
hardware.  For  example.  Item  number  3,  "Missile  abort  with¬ 
out  operator  alerts"  might  appear  to  have  "Fail  Safe" 
connotations  in  that  the  missile  aborted,  presumably  due  to 
proper  software/hardware  interaction.  However,  a  new  and 
unique  safety  problem  may  arise  if  the  operator  is  unaware 
of  the  abort  or  of  the  reason  for  the  abort  and  attempts  to 
operate  the  missile  again. 

With  the  hierarchial  Top  Down  development  process,  the 
software  design  cycles  is  as  shown  in  Figure  3.  As  in  the 
hardware  system  safety,  the  best  payoff  occurs  if  the  soft¬ 
ware  system  safety  is  inserted  early  in  the  design  process. 
Although  it  is  true  that  many  systems  have  an  initial  hard¬ 
ware  design  followed  by  the  design  of  the  control  software. 
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even  these  systems  should  undergo  a  software  analysis  for 
subsystems  and  for  system  interfaces  as  early  in  the  design 
process  as  practicable. 

For  those  subsystems  and  systems  in  which  the  software 
is  "King",  that  is  to  say,  for  systems  in  which  the  software 
is  the  principal,  and  possibly  the  first,  design  item  and 
then  the  hardware  is  designed  to  accomplish  the  desired  out¬ 
puts  of  the  system,  early  analysis  of  the  software  is 
essential. 

The  flow  of  control,  as  shown  in  Figure  3,  starts  at 
the  top  level ,  goes  down  one  or  more  levels ,  comes  back  as 
required,  and  then  goes  down  to  another  level. 

The  actual  assessment  depends  on  the  construction  of  a 
series  of  binary  trees  that  have  the  following  attributes: 

1.  There  is  a  one-to-one  correspondence  between 
decision  points  of  the  software  and  nodal  branch  points  of 
the  tree. 

2.  Each  branch  cam  be  categorized  by  a  series  of 

state  transitions  that  cam  be  described  by  Booleam  alogorithms. 

3.  The  origin  of  each  binary  tree  is  always  an  operation 
action,  a  machine-generated  priority  interrupt,  a  periodic 
status  monitoring,  or  a  return  to  a  higher  level  process. 

Figure  4  is  a  generalized  Binary  State  Transition  Tree 
which  illustrates  all  constructs.  Such  a  tree  will  be  examined 
for  possible  occurrences  of  system  errors  caused  by  program 
errors  (incomplete  definition  of  state  vectors  or  predicate 
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Figure  3.  Relationship  between  Software  Analysis  and 

Software  Design 
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transforms)  or  by  single  or  multiple  hardware  failures. 
System  errors  are  treated  as  virtual  branch  points  in  which 
the  occurrence  and  subsequent  processing  is  also  treated  as 
a  binary  tree. 

Such  a  tree  is  called  a  Binary  Fault  Termination  Tree 
and  is  also  shown  in  Figure  4.  This  tree  has  end  points 
that  may  describe  hazardous  terminations,  and,  as  such,  is 
of  prime  interest  in  software  system  safety  analysis. 

As  has  been  previously  stated,  the  only  software 
'mishap*  that  is  of  concern  in  the  system  safety  analysis 
is  one  that  produces  an  undesired  hardware  event.  For  this 
reason,  software  errors  that  occur  but  are  trapped  and 
contained  and  do  not  produce  hazardous  terminations  are  of 
but  secondary  interest  in  the  safety  analysis.  The  Binary 
Fault  Termination  Trees  are  used  to  evaluate  whether 
checkpoints  are  established  to  trap  and  contain  the  software 
errors. 

When  errors  have  been  identified  and  the  processing 
shows  that  a  hazardous  termination  is  being  approached, 
the  process  may  be  translated  into  a  probability  function 
to  determine  the  probability  of  the  hazardous  termination. 
This  translation  implies  that  the  following  probabilities 
can  be  computed: 

1.  The  probability  that  the  system  moves  into  the 
implied  mode. 

2.  The  probability  that  the  system  moving  from  the 
implied  mode  misses  the  illegal  procedure  trap. 
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VI.  BOTTOM  UP  SOFTWARE  ANALYSIS 

Another  technique  that  is  used  for  software  analysis 
is  the  Bottom  Up  Analysis.  This  technique  is  based  on  the 
predication  that  any  hazard  that  is  created  or  allowed  to 
propogate  must  exist  in  hardware.  Because  of  this,  the 
software  safety  analysis  should  analyze,  within  the  software 
boundaries,  concerns  identified  within  the  hardware  facets. 
This  stipulates  that  the  software  analysis  is  an  extension  of 
the  Preliminary  Hazard  Analysis  (PHA) . 

Inasmuch  as  the  Subsystem  Hazard  Analysis  (SSHA)  and 
the  System  Hazard  Analysis  (SHA)  follow  the  Preliminary 
Hazard  Analysis,  the  command  and  control  software  development 
and  analysis  should  follow  the  basic  development  of  the  hard- 
ware  that  is  to  be  controlled. 

This  offers  the  use  of  the  techniques  similar  to  the 
Fault  Hazard  Analyses  of  hardware  system  safety.  This 
technique,  which  is  akin  to  the  long-used  Reliability  tech¬ 
niques  of  Failure  Mode  and  Effect  Analysis  (FMEA)  or  the 
Failure  Mode,  Effect  and  Criticality  Analysis  (FMECA) , 
considers  the  undesired  outcomes  of  the  failure  of  a  sub¬ 
system  or  component. 

The  software  analyses  should  address  hazards  resulting 
from  basic  deficiencies  in  the  requirements,  the  software 
program  design,  the  internal  coding,  the  software  testing, 
the  user  interfaces,  the  backup  software  functions  and  the 
hardware/software  interfaces. 


The  general  approach  for  this  technique  is  as  follows:^- 

1.  Divide  the  system  into  portions  to  gain  insight  into 
sub-function  operations. 

2.  Conduct  the  hardware  system  safety  analysis. 

3.  Conduct  the  software  system  safety  analysis. 

a.  Analyze  for  correct  implementation  of  the 
hardware  functional  requirements  and  interfaces. 

(1)  Review  the  functional  requirements 
of  the  subsystems  being  controlled. 

(2)  Analyze  the  requirements  of  the 
software  definition  and  implementation  to  ensure  that 
the  software  is  kept  in  a  safe  configuration. 

b.  Analyze  for  internal  software  anomalies  which 
would  affect  the  execution  of  the  software. 

(1)  Analyze  those  functions  that  have  been 
designated  safety  critical. 

(2)  Analyze  those  functions  that  might  affect 
the  execution  of  the  safety  critical  functions. 

(3)  Review  for  implementation  of  errors  or 
anomalies  in  the  detailed  requirements,  the  detailed  program 
design  and/or  the  coding. 

(4)  Review  the  timing  of  competing/overlapping 

functions. 

Software  Safety  Analysis,  A  Presentation  by  John  G. 
Griggs,  Martin  Marietta  Corp.  at  Tri-Service  Safety  Conference 
Colorado  Springs,  CO,  September  1980. 
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(5)  Review  the  use  of  incorrect  data  and/or 
changing  data. 

(6)  Review  the  use  of  coding  techniques  to 
minimize  the  effects  of  errors. 
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VII.  COMBINED  TECHNIQUE 


A  combination  of  the  Top  Down  and  the  Bottom  Up 
Techniques  has  been  developed  by  the  Boeing  Aerospace  Company^ 
into  a  technique  called  Integrated  Path  Critical  Analysis 
(ICPA)  . 

The  seven  step  ICPA  combines  Fault  Tree  Analysis  for  the 
identification  of  critical  paths.  Failure  Mode  and  Effect 
Analysis  for  component  failure  rates.  Sneak  Circuit  Analysis 
for  actual  detailed  system  configuration  Component  Sensi¬ 
tivity  Analysis  for  the  determination  of  areas  of  emphasis 
and  Critical  Path  Analysis  to  evaluate  and  quantify  overall 
system  reliability  and  safety. 

The  steps  outlined  by  Boeing  Aerospace  for  this 
technique  are  as  follows: 

1.  Determine  the  Scope  of  the  Analysis. 

a.  This  is  usually  accomplished  by  reviewing  the 
top  level  hazards  and/or  critical  functions  and  then  deciding 
which  ones  should  be  analyzed  in  detail  and  to  what  level. 

This  may  be  done  by  using  existing  Fault  Trees  and  FMEA's 
and  marking  them  for  particular  emphasis. 

2.  Establishment  of  Accurate  System  Configuration  in 
the  form  of  Integrated  Functional  Network  Trees. 

- , - 

Software/Hardware  Integrated  Critical  Path  Analysis 
(ICPA)  by  J,  H,  Campbell  and  F,  H.  Tuna  in  Proceedings  of 
Fourth  International  Systetn  Safety  Conference,  July  9-13,  1979. 
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a.  It  is  essential  that  the  analysis  be  conducted 
on  an  accurately  represented  system.  The  use  of  logic  trees 
assists  in  the  determination  of  the  accuracy  of  the  system 
configuration  under  examination. 

3.  Updating  and  Integrating  Fault  Trees. 

a.  Here  one  applies  the  material  generated  from 
Step  #2  and  adds  the  software  network  threats  which  change 
the  impact  of  the  hardware  due  to  usage  and/or  interconnect 
hardware,  thereby  changing  the  hardware  interrelationships. 

4.  Reliability  Prediction  of  Critical  Paths. 

a.  Once  the  critical  paths  have  been  defined,  the 
reliability  numbers  can  be  affixed  to  present  a  reliability 
prediction. 

5.  Procedure  Analysis. 

a.  This  step  considers  all  procedures  including 
test,  contingency  and  backup. 

6.  Failure  Effect  Analysis. 

a.  The  previous  steps  have  emphasized  the  possible 
critical  steps  and  paths,  and  these  functions  are  now  analyzed. 

7.  Analyses  Reports. 

a.  These  reports  are  generated  to  permit  management 
decisions  in  regard  to  changes  and  alterations  to  the  suspect 
subsystems  and  systems.  These  reports  should  contain  the 
revised  Fault  Trees,  sensitivity  statements,  hardware/software 
interaction  problems  and  the  failure  effects  of  critical  paths. 
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VIII.  CONCLUSIONS 


The  subject  of  software  system  safety  is  relatively  new. 
Concern  in  the  Electronic  Industries  Association  (EIA)  was 
first  expressed  about  a  year  and  a  half  ago  and  the  EIA 
formed  a  task  force  to  develop  a  generic  approach  to  the 
analysis  of  software  system  safety  in  August  1979. 

This  action  is  significant  due  to  the  activity  of  the 
G-48  Committee  of  the  EIA  in  developing  codified  System 
Safety  requirements  and  fostering  the  development  of  System 
Safety  analysis  techniques. 

There  appear  to  be  techniques  available  for  the  analysis 
of  System  Safety  software  problems,  and  these  techniques  are, 
in  general,  based  on  the  proven  techniques  used  in  hardware 
System  Safety  analysis. 

Despite  the  particular  technique  to  be  used,  it  appears 
that  software  analysis  parallel  with  software  development  has 
the  greatest  payoff  at  this  time. 

Although  the  analysis  techniques  in  greatest  use  to  date 
have  been,  at  the  most,  a  modification  of  the  hardware  tech¬ 
niques,  there  are  people  working  on  different  techniques  which 
have,  as  yet,  not  demonstrated  any  spectacular  results.  This 
is  not  to  say,  however,  that  such  methods  are  not  to  be  closely 
monitored,  because  it  is  quite  possible  that  these,  rather 
radical,  methods  may  yet  have  a  significant  payoff. 
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